Pointwise ROC Confidence Bounds: An Empirical Evaluation
Abstract
This paper examines how to construct and evaluate pointwise confidence bounds on an ROC curve. We describe four confidence-bound methods, two from the medical field and two used previously in machine learning research. We first evaluate whether the bounds contain the relevant operating point on the “true” ROC curve with confidence 1−δ. We then evaluate pointwise confidence bounds on the region where the future performance of a model is expected to lie. For evaluation we use a synthetic world representing “binormal” distributions: the classification scores for positive and negative instances are drawn from separate normal distributions. For the “true-curve” bounds, all methods are sensitive to how well the distributions are separated, which corresponds directly to the area under the ROC curve. One method produces bounds that are universally too loose, another produces bounds that are universally too tight, and the remaining two come close to the desired containment, although containment breaks down at the extremes of the ROC curve. As expected, all methods fail when used to contain “future” ROC curves. Widening the bounds to account for the increased uncertainty yields qualitative results identical to those of the “true-curve” evaluation. We conclude by recommending a simple, very efficient method (vertical averaging) for large sample sizes and a more computationally expensive method (kernel estimation) for small sample sizes.
Related resources
Semi-Empirical Likelihood Confidence Intervals for the ROC Curve with Missing Data
The receiver operating characteristic (ROC) curve is one of the most commonly used methods to compare the diagnostic performances of two or more laboratory or diagnostic tests. In this thesis, we propose semi-empirical likelihood based confidence intervals for ROC curves of two populations, where one population is parametric while the other one is non-parametric and both populations have missin...
Exact maximum coverage probabilities of confidence intervals with increasing bounds for Poisson distribution mean
The Poisson distribution is widely used as a standard model for analyzing count data, so estimation of its parameter is widely applied in practice. Providing accurate confidence intervals for the parameters of discrete distributions is very difficult. So far, many asymptotic confidence intervals for the mean of the Poisson distribution have been provided. It is known that the coverag...
Confidence Bands for ROC Curves: Methods and an Empirical Study
In this paper we study techniques for generating and evaluating confidence bands on ROC curves. ROC curve evaluation is rapidly becoming a commonly used evaluation metric in machine learning, although evaluating ROC curves has thus far been limited to studying the area under the curve (AUC) or generation of one-dimensional confidence intervals by freezing one variable—the false-positive rate, o...
Jackknife Empirical Likelihood Based Confidence Intervals for Partial Areas Under ROC Curves
The partial area under the ROC curve (partial AUC) summarizes the accuracy of a diagnostic or screening test over a relevant region of the ROC curve and represents a useful tool for the evaluation and the comparison of tests. In this paper, we propose a jackknife empirical likelihood method for making inference on partial AUCs. Following the idea in Jing, Yuan, and Zhou (2009), we combine the e...
ROC Analysis in the Evaluation of Intelligent Medical Systems
A large number of intelligent medical systems exist, but few are in routine clinical use. This is due, in part, to a lack of a robust objective method to quantify the performance of such systems. Potentially, ROC analysis could form a basis for a robust and objective evaluation of intelligent medical systems, but existing methods of ROC analysis require large sample sizes to be statistically va...
Publication date: 2005